Harald Lüngen, Alexander Mehler and Angelika Storrer: Lexical-Semantic Resources in Automated Discourse Analysis
Authors
Abstract
In this paper, we address the task of automatically determining which discourse relation holds between two text spans. We focus on relations that are not explicitly signalled by a discourse marker such as "but". While lexical models have been found useful for the task, they are also prone to data sparseness, which is a serious drawback given the scarcity of discourse-annotated data. We therefore investigate whether lexical-semantic resources, such as WordNet, can be exploited to back off to a more general representation of lexical information in cases where data are sparse. We compare such a semantic back-off strategy to morphological generalisations over word forms, such as stemming and lemmatising.
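The back-off idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy hypernym map stands in for a resource such as WordNet, `toy_stem` is a crude suffix stripper rather than a real stemmer, and the frequency threshold `min_count` is hypothetical. It is not the method evaluated in the paper, only one way the contrast between semantic and morphological generalisation might look.

```python
from collections import Counter

# Hypothetical toy hypernym map standing in for a resource such as WordNet.
TOY_HYPERNYMS = {
    "poodle": "dog",
    "terrier": "dog",
    "sedan": "car",
    "hatchback": "car",
}


def semantic_generalise(word):
    """Map a word to its (toy) hypernym, or leave it unchanged."""
    return TOY_HYPERNYMS.get(word, word)


def toy_stem(word):
    """Crude suffix stripper, for illustration only (not a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def back_off(word, counts, generalise, min_count=2):
    """Keep a word seen at least min_count times; otherwise generalise it."""
    if counts[word] >= min_count:
        return word
    return generalise(word)


# Toy training tokens; a Counter returns 0 for unseen words.
counts = Counter(["dog", "dog", "barked", "barks", "poodle", "car", "car"])

semantic = [back_off(w, counts, semantic_generalise) for w in ["dog", "poodle", "sedan"]]
morphological = [back_off(w, counts, toy_stem) for w in ["barked", "barks", "dog"]]
```

In this sketch, frequent word forms are kept as they are, while rare or unseen forms are replaced by a more general class, so that sparse lexical features share statistics with related words.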
Similar resources
Introduction: Modeling, Learning and Processing of Text-Technological Data Structures
Modelling and Processing Wordnets in OWL
In this contribution, we discuss and compare alternative options of modelling the entities and relations of wordnet-like resources in the Web Ontology Language OWL. Based on different modelling options, we developed three models of representing wordnets in OWL, i.e. the instance model, the class model, and the metaclass model. These OWL models mainly differ with respect to the ontological statu...
What are Ontologies Good For? Evaluating Terminological Ontologies in the Framework of Text Graph Classification
This paper develops a graph-theoretical model of text representation based on lexical chaining. Other than present approaches to chaining, this model reflects the logical document structure of texts as well as semantic relations of their lexical constituents in order to compute text similarity values. By varying the terminological ontology used to induce such relations, a door is opened to syst...
Towards an encoding standard for social media and CMC: Experiences from German and French corpus projects using TEI
Format of this submission: Our proposal of a mini panel includes two papers (Beißwenger et al.) and (Chanier et al.). If accepted, we would like to introduce the panel with a little introduction (10-15 minutes) to the basics of text encoding with the TEI framework and some general challenges in modeling CMC with TEI. We would then present and discuss the two papers (= 40 minutes presentation + ...
Domain ontologies and wordnets in OWL: Modelling options
Wordnets are lexical reference systems that follow the design principles of the Princeton WordNet project (Fellbaum, ). Domain ontologies (or domain-specific ontologies such as GOLD, or the GENE Ontology) represent knowledge about a specific domain in a format that supports automated reasoning about the objects in that domain and the relations between them (Erdmann, ). In this paper, we...
Publication date: 2009